Bayesian Learning of Sum-Product Networks
Sum-product networks (SPNs) are flexible density estimators and have received
significant attention due to their attractive inference properties. While
parameter learning in SPNs is well developed, structure learning leaves
something to be desired: Even though there is a plethora of SPN structure
learners, most of them are somewhat ad-hoc and based on intuition rather than a
clear learning principle. In this paper, we introduce a well-principled
Bayesian framework for SPN structure learning. First, we decompose the problem
into i) laying out a computational graph, and ii) learning the so-called scope
function over the graph. The first is rather unproblematic and akin to neural
network architecture validation. The second represents the effective structure
of the SPN and needs to respect the usual structural constraints in SPNs, i.e.,
completeness and decomposability. While representing and learning the scope
function is somewhat involved in general, in this paper, we propose a natural
parametrisation for an important and widely used special case of SPNs. These
structural parameters are incorporated into a Bayesian model, such that
simultaneous structure and parameter learning is cast into monolithic Bayesian
posterior inference. In various experiments, our Bayesian SPNs often improve
test likelihoods over greedy SPN learners. Further, since the Bayesian
framework protects against overfitting, we can evaluate hyper-parameters
directly on the Bayesian model score, obviating the need for a separate
validation set, which is especially beneficial in low-data regimes. Bayesian
SPNs can be applied to heterogeneous domains and can easily be extended to
nonparametric formulations. Moreover, our Bayesian approach is the first that
consistently and robustly learns SPN structures under missing data.
Comment: NeurIPS 2019; see conference page for supplementary material.
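The structural constraints just mentioned are easiest to see in code. Below is a minimal sketch (not the paper's implementation) of evaluating a toy SPN in log-space: both product nodes are decomposable, since their children cover disjoint scopes, and the root sum node is complete, since its children share the same scope. The structure, weights, and leaf parameters are purely illustrative.

```python
# Minimal toy SPN evaluation in log-space; structure and parameters are
# illustrative, not the paper's learned models.
import numpy as np
from scipy.special import logsumexp
from scipy.stats import norm

def leaf(x, mu, sigma):
    """Univariate Gaussian leaf; its scope is a single variable."""
    return norm.logpdf(x, mu, sigma)

def spn_logpdf(x1, x2, w=(0.3, 0.7)):
    # Product nodes (decomposable): children have disjoint scopes {x1}, {x2},
    # so their log-densities simply add.
    p1 = leaf(x1, 0.0, 1.0) + leaf(x2, 0.0, 1.0)
    p2 = leaf(x1, 2.0, 0.5) + leaf(x2, -1.0, 0.5)
    # Root sum node (complete): both children share the same scope {x1, x2}.
    return logsumexp([np.log(w[0]) + p1, np.log(w[1]) + p2])

print(spn_logpdf(0.1, -0.2))  # exact log-density in a single bottom-up pass
```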
Automatic Bayesian Density Analysis
Making sense of a dataset in an automatic and unsupervised fashion is a
challenging problem in statistics and AI. Classical approaches for exploratory
data analysis are usually not flexible enough to deal with the uncertainty
inherent to real-world data: they are often restricted to fixed latent
interaction models and homogeneous likelihoods; they are sensitive to missing,
corrupt and anomalous data; moreover, their expressiveness generally comes at
the price of intractable inference. As a result, supervision from statisticians
is usually needed to find the right model for the data. However, since domain
experts are not necessarily also experts in statistics, we propose Automatic
Bayesian Density Analysis (ABDA) to make exploratory data analysis accessible
at large. Specifically, ABDA allows for automatic and efficient missing value
estimation, statistical data type and likelihood discovery, anomaly detection
and dependency structure mining, on top of providing accurate density
estimation. Extensive empirical evidence shows that ABDA is a suitable tool for
automatic exploratory analysis of mixed continuous and discrete tabular data.
Comment: In proceedings of the Thirty-Third AAAI Conference on Artificial
Intelligence (AAAI-19).
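As a rough illustration of the likelihood-discovery ingredient, the sketch below scores a few candidate likelihood families for a single column and normalizes the scores into weights. Maximum-likelihood fits are used here as a crude stand-in for ABDA's Bayesian treatment, and the candidate set is an assumption.

```python
# Toy per-column likelihood-type discovery: score candidate families and
# normalize. ML fits stand in for ABDA's Bayesian marginal likelihoods.
import numpy as np
from scipy import stats
from scipy.special import logsumexp

def type_weights(column, candidates):
    """Normalized weights over candidate likelihood families (uniform prior)."""
    scores = np.array([dist.logpdf(column, *dist.fit(column)).sum()
                       for dist in candidates])
    return np.exp(scores - logsumexp(scores))

x = np.random.default_rng(0).gamma(2.0, 1.0, size=500)  # synthetic column
print(type_weights(x, [stats.norm, stats.gamma, stats.expon]))
```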
Conditional Sum-Product Networks: Imposing Structure on Deep Probabilistic Architectures
Probabilistic graphical models are a central tool in AI; however, they are
generally not as expressive as deep neural models, and inference is notoriously
hard and slow. In contrast, deep probabilistic models such as sum-product
networks (SPNs) capture joint distributions in a tractable fashion, but still
lack the expressive power of intractable models based on deep neural networks.
Therefore, we introduce conditional SPNs (CSPNs), conditional density
estimators for multivariate and potentially hybrid domains which allow
harnessing the expressive power of neural networks while still maintaining
tractability guarantees. One way to implement CSPNs is to use an existing SPN
structure and condition its parameters on the input, e.g., via a deep neural
network. This approach, however, might misrepresent the conditional
independence structure present in data. Consequently, we also develop a
structure-learning approach that derives both the structure and parameters of
CSPNs from data. Our experimental evidence demonstrates that CSPNs are
competitive with other probabilistic models and yield superior performance on
multilabel image classification compared to mean field and mixture density
networks. Furthermore, they can successfully be employed as building blocks for
structured probabilistic models, such as autoregressive image models.
Comment: 13 pages, 6 figures.
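The first implementation route mentioned above, conditioning the parameters of a fixed SPN structure on the input via a neural network, can be sketched as follows. The tiny two-component structure, gating net, and Gaussian leaves are illustrative assumptions, not the paper's architecture.

```python
# Hedged sketch of a conditional SPN: a neural net maps x to the sum-node
# weights and leaf parameters of a fixed two-component structure over y.
import torch
import torch.nn as nn

class TinyCSPN(nn.Module):
    def __init__(self, x_dim):
        super().__init__()
        # Gating net: input-dependent weights of the root sum node.
        self.gate = nn.Sequential(nn.Linear(x_dim, 16), nn.ReLU(),
                                  nn.Linear(16, 2))
        # Leaf net: (mu, log_sigma) for each of the two Gaussian leaves over y.
        self.leaf = nn.Linear(x_dim, 4)

    def log_prob(self, x, y):
        log_w = torch.log_softmax(self.gate(x), dim=-1)   # sum-node weights
        mu, log_sigma = self.leaf(x).chunk(2, dim=-1)     # leaf parameters
        comp = torch.distributions.Normal(mu, log_sigma.exp()).log_prob(y)
        return torch.logsumexp(log_w + comp, dim=-1)      # exact log p(y | x)

model = TinyCSPN(x_dim=3)
x, y = torch.randn(8, 3), torch.randn(8, 1)
print(model.log_prob(x, y).shape)  # one conditional log-density per example
```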
How to Turn Your Knowledge Graph Embeddings into Generative Models
Some of the most successful knowledge graph embedding (KGE) models for link
prediction -- CP, RESCAL, TuckER, ComplEx -- can be interpreted as energy-based
models. Under this perspective, they are not amenable to exact
maximum-likelihood estimation (MLE) or sampling, and they struggle to integrate
logical constraints. This work re-interprets the score functions of these KGEs as
circuits -- constrained computational graphs allowing efficient
marginalisation. Then, we design two recipes to obtain efficient generative
circuit models by either restricting their activations to be non-negative or
squaring their outputs. Our interpretation comes with little or no loss of
performance for link prediction, while the circuits framework unlocks exact
learning by MLE, efficient sampling of new triples, and a guarantee that logical
constraints are satisfied by design. Furthermore, our models scale more
gracefully than the original KGEs on graphs with millions of entities.
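To make the non-negativity recipe concrete, the sketch below applies it to a CP-style trilinear scorer with toy random embeddings: once all factors are non-negative, the partition function over all triples collapses into a rank-sized inner product, yielding exact normalized probabilities. Sizes and embeddings are illustrative, not the paper's trained models.

```python
# Toy non-negative CP factorization as a generative model over triples.
import numpy as np

rng = np.random.default_rng(0)
E, R, D = 100, 10, 16                  # entities, relations, CP rank
S = rng.random((E, D))                 # non-negative subject embeddings
P = rng.random((R, D))                 # non-negative relation embeddings
O = rng.random((E, D))                 # non-negative object embeddings

def score(s, r, o):
    """CP trilinear score; non-negative by construction."""
    return np.sum(S[s] * P[r] * O[o])

# Exact partition function: the sum over all E * R * E triples factorizes
# into a single rank-D inner product.
Z = np.sum(S.sum(axis=0) * P.sum(axis=0) * O.sum(axis=0))

def prob(s, r, o):
    """Exact normalized probability of a triple."""
    return score(s, r, o) / Z

print(prob(3, 1, 7))
```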
Learning Deep Mixtures of Gaussian Process Experts Using Sum-Product Networks
While Gaussian processes (GPs) are the method of choice for regression tasks,
they also come with practical difficulties, as inference cost scales cubically
in time and quadratically in memory. In this paper, we introduce a natural and
expressive way to tackle these problems, by incorporating GPs in sum-product
networks (SPNs), a recently proposed tractable probabilistic model allowing
exact and efficient inference. In particular, by using GPs as leaves of an SPN
we obtain a novel flexible prior over functions, which implicitly represents an
exponentially large mixture of local GPs. Exact and efficient posterior
inference in this model can be done in a natural interplay of the inference
mechanisms in GPs and SPNs. Thereby, each GP is -- as in a mixture-of-experts
approach -- responsible only for a subset of the data points, which
effectively reduces inference cost in a divide and conquer fashion. We show
that integrating GPs into the SPN framework leads to a promising probabilistic
regression model which (1) is computationally and memory efficient, (2) allows
efficient and exact posterior inference, (3) is flexible enough to mix
different kernel functions, and (4) naturally accounts for non-stationarities
in time series. In a variety of experiments, we show that the SPN-GP model can
learn input dependent parameters and hyper-parameters and is on par with or
outperforms traditional GPs as well as state-of-the-art approximations on
real-world data.
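The divide-and-conquer idea can be sketched with two local GP experts, each fit only to its half of the input space and mixed with weights derived from the experts' log marginal likelihoods. This is a loose stand-in for the exact SPN posterior inference described above; the split rule and kernel choice are illustrative.

```python
# Toy mixture of two local GP experts, SPN-style: a "product-node" partition
# of the inputs and a "sum-node" mixture over the resulting experts.
import numpy as np
from scipy.special import logsumexp
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(-3, 3, 200))[:, None]
y = np.sin(2 * X[:, 0]) + 0.1 * rng.standard_normal(200)

experts, log_ws = [], []
for mask in (X[:, 0] < 0, X[:, 0] >= 0):   # each expert sees ~100 points
    gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel())
    gp.fit(X[mask], y[mask])
    experts.append(gp)
    log_ws.append(gp.log_marginal_likelihood_value_)

w = np.exp(np.array(log_ws) - logsumexp(log_ws))  # sum-node mixture weights
x_test = np.array([[0.5]])
pred = sum(wi * gp.predict(x_test) for wi, gp in zip(w, experts))
print(pred)  # mixture-of-local-GPs prediction
```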